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Abstract 



A composition analysis of KASCADE air shower data is performed by means of 
unfolding the two-dimensional frequency spectrum of electron and muon numbers. 
Aim of the analysis is the determination of energy spectra for elemental groups 
representing the chemical composition of primary cosmic rays. Since such an analysis 
depends crucially on simulations of air showers the two different hadronic interaction 
models QGSJet and SIBYLL are used for their generation. The resulting primary 
energy spectra show that the knee in the all particle spectrum is due to a steepening 
of the spectra of light elements but, also, that neither of the two simulation sets is 
able to describe the measured data consistently over the whole energy range with 
discrepancies appearing in different energy regions. 



1 Introduction 

The energy spectrum of primary cosmic rays, extending over more than 12 
decades in energy, follows, over a large range, a simple power law d.J/dE 
oc E 1 indicating its non-thermal character. However, in the region between 
1 PeV and 10 PeV a change of the spectral index from 7 m —2.65 to 7 m —3.1 
occurs, the so-called knee in the spectrum of cosmic rays. Since its discovery [1] 
nearly 50 years ago many measurements have been performed in this energy 
range (see e.g. [2] for recent measurement results), but the origin of the knee 
feature is still not convincingly explained. 

Proposals for its origin range from astrophysical scenarios like the change of 
acceleration mechanisms [3,4,5] at the sources of cosmic rays (supernova rem- 
nants, pulsars, etc.) or effects due to the propagation [6,7] inside the Galaxy 
(diffusion, drift, escape from the Galaxy) to particle physics models like the 
interaction with relic neutrinos [8] during transport or new processes in the 
atmosphere [9,10] during air shower development. Common to all models is 
the prediction of a change of composition over the knee region. Moreover, in 
order to distinguish between individual models, knowledge of the energy spec- 
tra of individual elements or at least mass groups of primary cosmic rays is 
desired since the different models predict different spectral shapes. 

Because of the low fluxes of cosmic rays only indirect measurements via the 
detection of extensive air showers (EAS) induced by primary cosmic ray parti- 
cles in the atmosphere are feasible at present in the energy range close to and 
above 1 PeV. Determination of spectra for individual elements or mass groups 
is limited by the large intrinsic fluctuations of EAS observables. Furthermore, 
any analysis of air shower data has to rely on EAS simulations and our lim- 
ited knowledge of particle physics in the energy range of relevance. Since the 
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primary energies of the showers are beyond the energy range of man-made 
accelerators and reactions relevant to shower development occur in the very 
forward direction not accessible in collider experiments, uncertainties in the 
description of hadronic interactions in shower development are unavoidable. 
One has, therefore, to rely on the use of phenomenological interaction models 
which differ in their predictions in some respect strongly, making the task of 
retrieving information about individual energy spectra from air shower data 
even more difficult. Approaches facing these difficulties by using statistical 
methods and extensive comparisons with simulations can be found e.g. in 
[11,12]. 

In this paper we present an analysis of the classical EAS observables, elec- 
tron and muon numbers, which deals with these problems. Because of the 
high accuracy of the KASCADE experiment the presented method, based on 
unfolding procedures, is capable of reconstructing energy spectra for five ele- 
ments representing different mass groups of primary cosmic ray particles. The 
analysis is performed twice using simulations with two different high energy 
hadronic interaction models, QGSJet [13] and SIBYLL [14]. This approach 
gives also a lower limit of the uncertainties due to the modelling of hadronic 
interactions. It turns out that the analysis is sensitive to the different models 
allowing to identify inconsistencies between simulations and data and to give 
hints to improve the models. 

After a brief description of the experimental setup and the data used in Section 
2 the idea and the approach of the analysis are outlined in Section 3. Here the 
main objective is the formulation of the relation between the measured two- 
dimensional shower size spectrum and the primary energy spectra as matrix 
equation. Mathematical details of this procedure are given in Appendix A.l. 
In this equation all relevant EAS and reconstruction properties are contained 
in the so-called response matrix. Section 4 deals with the description of the 
distributions necessary for the calculation of the matrix elements. In Section 5 
unfolding as a method for solving the matrix equation is introduced whereas 
in Section 6 its application to Monte Carlo data is discussed. The results of 
the unfolding analysis applied to measured data are presented in Section 7 
and discussed in Section 8, followed by the conclusions. 



2 The KASCADE experiment and data selection 

The KASCADE (Karlsruhe Shower Core and Array Detector) experiment [15] 
investigates air showers in a primary energy range from 100 TeV to 100 PeV 
and measures a large number of observables for each event: electrons, muons at 
four energy thresholds, and energy and number of hadrons. The main detector 
components of KASCADE are the field array [15], the central detector [16,17], 
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Fig. 1. Left: layout of the KASCADE air shower experiment. Right: sketch of a 
detector station with shielded and unshielded scintillation detectors. 

and the muon tracking detector [18]. In the present analysis only data from 
the field array is used. Detailed descriptions of the experimental setup and 
reconstruction procedures of the main shower observables can be found in 
[15,19]. 

The field array measures electrons and muons (E^ > 230 MeV) in the shower 
separately using an array of 252 detector stations containing shielded and 
unshielded detectors, arranged on a square grid of 200 x 200 m 2 with a spacing 
of 13 m. These stations are organized in 16 so-called clusters, each consisting 
of 16 stations in the outer part and 15 stations in the inner part of the array, 
respectively. Fig. 1 displays a sketch of the installation and of a detector 
station. 

The array observables used in the following are the total electron number 
N e and the truncated muon number Nfj~ . The latter is the number of muons 
with distances to the shower core between 40 m and 200 m. Input of the 
analysis is the two-dimensional shower size distribution with respect to these 
two observables. It is displayed in Fig. 2. The zenith angle of the showers in 
the analysis is restricted to values between 0° and 18°. In order to ensure a 
high quality of the reconstructed shower observables the following cuts are 
applied: 

• The location of the reconstructed shower core has to lie inside a circle of 
91 m radius around the center of the array. In this way an erroneous recon- 
struction of showers with cores outside the array can be mostly avoided. 

• The reconstructed age-parameter s of the fit with the NKG-function to the 
lateral distribution of electrons has to be inside the interval 0.2 < s < 2.1. 
Values larger or smaller correspond to poorly reconstructed showers which 
are mostly small but may be reconstructed with large shower sizes [19]. 
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Fig. 2. Two-dimensional shower size spectrum used in the analysis. The range in 
lg N e and lgNj[ is chosen to avoid influences of inefficiencies. 

• Only measurement runs with all clusters active are considered. Missing 
clusters during measurement strongly influence the measurement and re- 
construction thresholds. 

• To avoid threshold effects only showers with showers sizes lgiV e > 4.8 and 
lgiV* r > 3.6 are included. 

The total number of events remaining after these cuts amounts to 6.9 • 10 5 and 
the effective measurement time adds up to 900 days. This rather small number 
of remaining showers is due to the severe cuts applied in order to guarantee 
a high data quality. As will be seen in the following, the remaining statistical 
base is sufficient and not the limiting factor for the reliability of the results. 



3 Outline of the analysis 

Starting point of the analysis is the two-dimensional shower size spectrum 
and the contents (number of events) of the histogram cells displayed in Fig. 2. 
In the following each cell of the shower size spectrum is labeled by a single 
index i for identification. The number of events in each cell i results from 
the superposition of contributions induced by different primary particles with 
various energies. In this sense information about the primary energy spectra of 
all particle types is present in each cell and the analyzing task is to disentangle 
this information. 
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Mathematically the content of a specific cell i of the two-dimensional spectrum, 
i.e. the number of showers iVj with shower sizes (lgiV e , \gNj[)i of cell i, is 
related to the flux of primary cosmic ray elements via the integral equation: 

N l = 2nA s T m Y / J J j T ^PA((lgN e ,\gN t ;) i \\gE)sm9cos9dlgEd9 (1) 

where dJ^/d lg E denotes the differential flux of an element with mass number 
A and the summation is carried out for all elements present in the primary 
cosmic radiation. The conditional probability pa describes the probability to 
measure a shower of primary energy lg E and primary mass A with shower sizes 
(lgiV e ,lgiV* r )j. Measurement time T m and sampling area A s can be treated 
as constants. For the data range considered no dependence on azimuth angle 
is found which results in the factor of 2n. Any dependence on solid angle is 
therefore reduced to the integration over zenith angle ranging from 0° to 18°. 

The probability pa itself is an integral: 

+oo +oo 

PA= J J s A e A r A d lg N l r e d lg N^ true (2) 

— oo — oo 

where s A = s A (\gN* rue AgNj[' true \lgE) are the intrinsic shower fluctuations 
describing the probability for a shower with primary mass A and energy 
IgE to exhibit shower sizes lgN* rue and \gNj[' true at observation level. e A = 
e A (\g iVf" 16 , lg j\r* r,irne ) represents the detection and reconstruction efficiency 
which depends on the true shower sizes. The probability r A = ^((lg N e , lg iV* r ) i 
\g N* rue Ag Nj[' true ) eventually describes the properties of the reconstruction 
procedure. It accounts for the resolution of the reconstruction algorithms and 
systematic effects like under- and overestimation of the shower sizes due to the 
used fit functions or saturation effects of the detectors, e.g. (see section 4.3.2 
for details). In addition, all these quantities (especially the shower fluctuations 
sa) depend in principle on zenith angle. 

Using the notation of Eqs. (1) and (2) the data histogram of Fig. 2 is inter- 
preted as a system of coupled integral equations. In order to solve this set 
of equations for the energy spectra dJ A /d\gE it will be reduced to a matrix 
equation. The reformulation of the integral equations as a matrix equation is 
straightforward and explained in detail in Appendix A.l. With the data vector 
Y, whose elements yi are the cell contents Ni of Fig. 2, i.e. the two-dimensional 
shower size spectrum, and the vector of unknowns X, which represents the 
energy spectra of the individual primary particle types, the problem can be 
written as 

Y = RX (3) 
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where R is the so-called response or transfer matrix. The elements of R relate 
the energy spectra to the observed shower size spectrum via the probabilities 
for measuring the observables lgiV e , IgNjJ" of an air shower induced by a 
primary particle with mass A and energy IgE. All physical information about 
the air showers as well as the detection and reconstruction properties are 
contained in the response matrix R. Any results regarding energy spectra 
depend crucially on the knowledge of the matrix elements. The calculation 
of the matrix elements, i.e. the determination of the quantities sa, f'A-, an d 
e A was based on Monte Carlo simulations using the CORSIKA [20] program. 
Due to the impossibility to account in the analysis for all elements present in 
cosmic rays, we confine ourselves to five elements representing individual mass 
groups: hydrogen (proton), helium, carbon (CNO-group), silicon (intermediate 
elements) and iron (heavy component). 



4 Determination of the matrix elements 

4-1 Simulation strategy 

For the calculation of the matrix elements one has to rely on simulations in 
order to determine the shower fluctuations, efficiencies, and reconstruction 
properties. The corresponding simulated distributions are parameterized to 
simplify the numerical integrations. This approach allows also the investigation 
of the influence of unknown tails of the shower fluctuations, which are poorly 
determined statistically, on the result. This gives at least an estimate of this 
systematic uncertainty. The following simulation strategy is pursued: 

(1) The relevant shower size distributions are determined and parametrized 
for a set of fixed primary energies. These simulations are carried out using 
the CORSIKA code with the high energy interaction models QGSJet 01 
and SIBYLL 2.1. For the low energy interactions the GHEISHA 2002 [21] 
code is used. The electromagnetic part of the showers is simulated using 
the EGS4 [22] code. In addition the thinning option [23] for a faster sim- 
ulation was enabled. The energy dependence of the relevant parameters 
of the shower size distribution is interpolated. The simulated energies are 
0.1 PeV, 0.5 PeV, 1 PeV, 3.16 PeV, 10 PeV, 31.6 PeV, 100 PeV, 316 PeV 
and 1 EeV and the value of the thinning level is e = 10~ 6 for all ener- 
gies. The number of simulated showers for the corresponding energies is 
8000, 6000, 4000, 3000, 2000, 1500, 1000, 750, and 500, respectively, dis- 
tributed between 0° and 18°. A comparison between showers simulated 
with different thinning levels e and without thinning was carried out at 
a primary energy of 1 PeV in order to chose a thinning level for which 
inescapable additional artificial fluctuations are sufficiently small. The 
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relevant shower size distributions s £ A of the simulation sets with thinning 
were tested for compatibility with the corresponding distribution sa de- 
fined by the simulation set without thinning. This was done by means 
of a Kolmogorov-Smirnov test. In addition, a comparison between the 
shower size distributions of simulation sets using different thinning lev- 
els was performed at primary energy 1 EeV to cross-check the energy 
independence of e. 

(2) For the determination of efficiency and reconstruction properties a second 
set of CORSIKA simulations was used which consists of unthinned air 
showers, followed by a detailed GEANT [24] simulation of the KASCADE 
detectors and reconstruction by the standard KASCADE reconstruction 
software. The initial air showers are generated according to a continuous 
energy spectrum between 10 14 eV and 10 18 eV following a power law with 
differential index 7 = —2. This procedure was also performed for the two 
interaction models QGSJet 01 and SIBYLL 2.1. 



4-2 Determination of shower fluctuations sa 



The most important distribution for the calculation of the matrix elements is 
the correlated lgiV e — lgA"* r - probability distribution, i.e. the shower fluctu- 
ations 5,4. Their parameterization is carried out in two steps. For the param- 
eterization of the IgAe - distribution for a fixed primary particle and energy 
the following function is used: 

p(\gN e \ IgE) = po ■ erf ( ^"^ • exp (p 3 • (lgiV e - p 4 )) • (p 4 - lg JV e ) P5 (4) 



Here p(lg N e \ lg E) is the probability density for lg N e and po is a normalization 
constant. The notation erf(x) is the integral of a Gaussian with mean and 
variance 0.5 between —00 and x. It turned out that the parameters p 3 and 
p 5 can be treated as energy independent whereas p±, p 2 , and p 4 depend on 
primary energy. For values of lgiV e larger than p 4 the probability density is 
assumed to be zero. 

To describe the correlation between electron number and truncated muon 
number it is useful to look at the fraction Q of showers with muon number 
above some fixed value lg Nj['° as function of the electron size lgiV e . This 
fraction can be well described by an error function with varying width 

lgiV e -lgiVo \ 
p 6 -p 7 (\gN -\gN e ))> 1 ] 



Q(lgAUgA^°|lg£) = erf 
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where lgiVo is a parameter depending on the value of \gNj['°. For the relation 
between lg N and lg N^ a quadratic dependence was assumed: 

lg N = c + d • lg <'° - c 2 • (lg N^f. (6) 



Using this fraction Q and the probability density p(lgiV e | IgE) the correlated 
probability P(lg N e , lg Nf[\ lgi?) for a shower of primary energy IgE to have 
shower sizes lgiV e and lgNj[ can be written as 

= (QilgN^gNp-QilgN^lgN^ + dlgNp) p(lg JV e | IgE) d\gN e 
= s A d\gN e d\gN t ; (7) 

This parameterization of the shower fluctuations for fixed particle type and 
energy is used for the numerical evaluation of the matrix elements. 

To determine the free parameters of Eq. (7) it is more practicable to determine 
first the parameters of Eq. (4) via a fit to the electron size distribution and 
afterwards the muon parameters of Eq. (5) by fitting the truncated muon 
number distribution. The form of the latter one can be described using Eq. (7) 




lgN e 




lgN e 



Fig. 3. Left: Parameterization of the electron size distribution (proton, 0.5 PeV, 
QGSJet) according to Eq. (4). The quality of the fit is indicated by the value of 
X 2 per degree of freedom. Right: Dependence of the fraction of showers, Q, with 
lgiV* r > 3.2 on lg-/V e for the same simulated showers. The displayed function is a 
fit with Eq. (5). 
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Fig. 4. Left: Distribution of muon number for proton induced showers (0.5 PeV, 
QGSJet), together with a fit according to Eq. (8). Right: lgiVe — lgNj[ distribution 
of simulated showers (proton, 0.5 PeV, QGSJet) and the corresponding parameter- 
ization. 

by 

^i^.J ^M ^, (g) 

— oo 

As examples, some distributions for fixed primary energy and the correspond- 
ing fits are illustrated in Fig. 3 and Fig. 4. 

Investigation of the energy dependence of the various parameters showed that 
for each primary species the parameters p%, p$, p?, ci, and C2 can be treated 
as energy independent. Furthermore, for each primary particle type the same 
values of p^ can be used which indicates that the electron number at shower 
maximum is almost independent of primary mass. The energy dependence of 
the remaining five parameters is interpolated using polynomials. As an exam- 
ple, the parameters p\ and P2 for proton induced showers (QGSJet simulations) 
are displayed in Fig. 5. 

4-3 Properties of the reconstruction 

For the investigation of the reconstruction properties, the set of fully simulated 
(no thinning) CORSIKA showers is used, including a GEANT based simula- 
tion of the KASCADE experiment. In these detector simulations all properties 
of the detectors and the electronics are accounted for. The reliability of the 
simulations was checked independently e.g. by comparison between simulated 
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Fig. 5. Interpolated energy dependence of parameters p\ and p 2 of Eq. (4) for the 
case of proton induced showers (QGSJet simulations). 

and measured single muon spectra recorded by the array detectors. The output 
of the simulations has the same data structure as measured events. Therefore, 
simulated and measured showers are indistinguishable for the reconstruction 
process and can be treated with the usual KASCADE reconstruction algo- 
rithms. 



4-3.1 Estimate of tA 

Although the data range for the following analysis is chosen in a way to min- 
imize influences from possible efficiency variations, it is useful to parametrize 
the combined trigger and reconstruction efficiencies for the calculation of the 
response matrix elements. Details of trigger and reconstruction efficiencies 
can be found in [15]. Since the number of fired detectors depends, to very 
good approximation, on electron shower size only, the trigger efficiency can be 
well approximated by an integrated Gaussian distribution, depending only on 
lg Nl rue . 

The reconstruction of a measured shower is only successful if both, electron 
and muon number, can be determined. The probability of a successful recon- 
struction depends on the muon number only, as the determination of N e is 
possible for every triggered shower. The combined efficiency e = e(N e , Nf[) 
for triggering the measurement and successful reconstruction can be parame- 
terized by the product of two error functions: 

r/ \ kN e -po , \gN tr -p 2 
e = erf (a) • ert(o) with a = , = (9) 

Pi P3 

For the considered zenith angle range typical values for proton induced showers 
are p = 3.75, p\ = 0.16, p 2 = 2.32, and p 3 = 0.36. A slight dependence on 
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primary particle type was found for the parameters po and p±. In conclusion, 
all showers with lgN e > 4.4 and lg Nj[ > 3.4 trigger the experiment and are 
reconstructed successfully, regardless of primary particle type. 



4-3.2 Parameterization oJta 

Of further importance for the determination of the response matrix is the dif- 
ference between reconstructed shower size and its true value. In the case of the 
electron number the mean value of the difference A lg N e = lg N e — lg N l J ue 
shows a dependence on the difference lg N l J ue — lg Nj[ between true electron 
and reconstructed truncated muon number. This correlation is displayed in 
the left part of Fig. 6. This dependence proved to be independent of primary 
particle type and primary energy and is used for a parameterization of A lg N e 
depending on lgN l J ue — lgNf[, which defines a correction term C e to be sub- 
stracted from lg N e . The main reasons for this systematic effect are deviations 
between the observed lateral distribution and the NKG function used to de- 
termine the particle number which has to be integrated over the whole lateral 
distance range. The size of the systematic difference between true and recon- 
structed electron number is strongly correlated with the shower age which 
itself is strongly correlated with the ratio between electron and muon num- 
ber. A more detailed analysis of these interrelationships will be the topic of a 
forthcoming paper. 

The dependence of the corrected difference lg N e — lg jV*™ 6 — C e on true elec- 
tron number is displayed in the right part of Fig. 6. The increase towards low 




Fig. 6. Left: Difference between reconstructed and true electron number, dependent 
on the difference \gN^, rue — IgNff. Right: Remaining systematic difference after 
correction with the relation in the left part of the figure. Simulations were generated 
using QGSJet. 
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Fig. 7. Left: Distribution of lg N e — \gN* rue — C e for proton induced showers with 
4.2 < \gNfT ue < 4.3 fitted with a Gaussian distribution. Right: Dependence of the 
width a on the true electron number lg N^, rue . 



values of \gN]T ue is due to the combined threshold of measurement and re- 
construction, the decrease of lg N e — lg jV* rne — C e towards larger shower sizes 
reflects saturation effects in the array detectors which influence the quality of 
the reconstruction. These deviations from the zero line are parameterized and 
accounted for in the analysis. 

The distribution of lg N e — lg N* rue — C e for fixed true electron number lg N* rue 
can be described with good quality by a Gaussian. An example for this is 
displayed in the left part of Fig. 7. The form and the parameters of this distri- 
bution do not depend on primary particle type. The width of the distribution 
depends on lg Nl rue but can be easily parameterized which is displayed in the 
right part of Fig. 7. The adopted description of the reconstruction systematics 
and resolution of lg N e can be integrated into the calculation of the response 
matrix elements. The influence of resolution effects on the results are small 
since for the showers used (lgiV e > 4.8) the resolution is smaller than the bin 
width in lgiV e . 



In the case of the truncated muon number a correlation between the difference 
lg Nf[ — lg ]\f^ true and the true electron number lg N^ rue was found. This corre- 
lation, displayed in the left part of Fig. 8, proved also to be nearly independent 
of primary particle type. Using this correlation for the parameterization of a 
correction term any dependence of lgiV* r — \gNj[ ,true on lg Nj[ nearly 
vanishes. The mean values of IgNff — \gNff ,true — versus muon number 
\gNf[ ,true are displayed in the right part of Fig. 8. Deviations from the zero 
line for small and large values of the muon number have similar reasons as in 
the case of lg N e and are accounted for. 
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, , true T , T tr, true 

Fig. 8. Left: Relation between lg Nff — lg Nj[' true and lg N l ^ ue used for the correction 
of lg-/V* r . Right: Remaining systematic differences between reconstructed and true 
truncated muon number after correction with the relation in the left part of the 
figure. 



In contrast to the case of the electron number the distribution of — 
lg ]\f t ^ true — is asymmetric for smaller values of lg N t ^ true but becomes more 
and more symmetric with increasing \gNff ,true . For values \gNj[ ,true > 4 the 
distribution can be described again by a Gaussian. In Fig. 9 these distributions 
are displayed for two narrow intervals of lg J\f^ true . In order to describe the 
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Fig. 10. Dependence of parameters p\ (left) and P2 (right) of Eq. (10) on the true 
muon number lgNj[' true . 

asymmetric distribution the following functional form was used 

r(x) = { Cl - 6 "^ r X>Pl ~ P i (10) 
cic 2 ■ e?3 x < px - ^ 



with the factor normalization constant. The value of p\ tends to zero 

with increasing \gN t J' true . The dependence of p\ and P2, which can be consid- 
ered as a measure for the resolution of the muon number determination, on 
the true muon number are displayed in Fig. 10. 



5 Solving the matrix equation 



5. 1 Application of unfolding methods 



From a purely mathematical point of view, solving the matrix equation Eq. (3), 
only requires a simple inversion of the matrix R . However, a closer inspec- 
tion of this matrix shows, that it is close to singularity and Eq. (3) states an 
ill-conditioned problem. Therefore, a direct inversion would give meaningless 
results. 

The reason for the poor condition of the response matrix is closely related 
to the properties of shower fluctuations. Since for different primary particles 
the corresponding distributions overlap to a large extent, the discrimination 
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between these particles gets more and more difficult with increasing number 
of particle types. In the extreme case of very similar particles, like for example 
nitrogen and oxygen, the corresponding matrix elements would coincide inside 
the computational accuracy for a reasonable binning of the data. In this case R 
is singular, and an inversion impossible. Therefore, the number of considered 
particle types has to be restricted. 

Another reason lies in the steeply falling primary spectrum. Due to their broad 
shower fluctuations low energy primary particles may be registered at high 
electron and muon numbers. Although the probabilities for this are very small 
this may be compensated by their high flux. This is reflected by a few very 
small matrix elements (in the order of 10~ 5 and smaller). As a result, nearly 
identical rows and columns consisting of very small values are present in R 
even when only one primary particle type is considered. Again, this leeds to a 
nearly singular matrix. In addition, also many small off-diagonal elements are 
introduced in the response matrix which are sensitive to rounding errors. 

Altogether, the response matrix R exhibits nearly identical rows and columns 
and many small off-diagonal elements. In such case, inversion of a matrix is in 
general an ineffective strategy, and one has to rely on methods which approx- 
imate the solution avoiding the problems inherent to matrix inversion. One 
class of methods especially suited for the determination of approximate solu- 
tions of ill-conditioned matrix and integral equations are so-called unfolding 
or deconvolution techniques. For this there exist many different algorithms, 
each one with its own systematic properties. To get a measure for the size 
of the systematic errors caused by the unfolding three different methods are 
used in the present analysis. These are the Gold algorithm [25], unfolding 
based on the Bayesian theorem [26] and an entropy based unfolding method 
[27]. Characteristic of these procedures is the generation of only non-negative 
solutions. Properties of these methods and details about their application are 
briefly presented in Appendix A. 2. 



5.2 Considered primary elements 

For unfolding techniques to be applicable the matrix equation (3) has to ex- 
hibit a minimum degree of stability for the algorithms to provide meaningful 
solutions. This stability is characterized by the condition number of the re- 
sponse matrix, which is strongly influenced by the number of primary particle 
types included in the analysis. One is restricted to a maximum number of 
elements since otherwise this would lead to a singular matrix. The relevant 
quantity here is the condition number defined by the ratio of the biggest to 
the smallest singular value of a matrix. The larger its value, the poorer is 
the condition of the matrix. Acceptable values are in the range of 10 6 to 10 7 , 
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depending on the the specific problem. 

To find maximum number of particle types, the number was varied, and in 
each case an unfolding procedure performed. For this investigations results of 
QGSJet 01 based simulations were used. The quality of the results was judged 
by means of a x 2 -comparison with the measured data (see section 8.2 or 8.3 for 
details). To determine the condition number of the corresponding matrices a 
singular value decomposition was performed. For the use of only two primaries 
(H and Fe) a x 2 per degree of freedom of 245 was achieved, for three particles 
(H, C, Fe) a value of 35, for four particles a value of 3.3 in the case of H, He, C, 
and Fe, and 2.5 for the use of H, He, C and Si, respectively. For five elements 
(H, He, C, Si, Fe) a value of 2.38 for \ 2 P er degree of freedom was found. At 
the same time the condition number increases from 2.3 • 10 5 (H, Fe) to 1.7- 10 6 
(H, C, Fe) and 4 • 10 6 for four primaries up to 8.5 ■ 10 6 in the case of H, He, 
C, Si and Fe. In addition, a significant increase in the statistical uncertainty 
of the solution with the transition from four to five primaries was observed. 

Due to the already large condition number in the case of five elements and 
only small improvement in the description of the data by the solution, finally 
five primary particle types are adopted for the analysis. These are hydrogen 
(protons), helium, carbon, silicon, and iron. The spectra of proton and helium 
will describe the energy spectra of single elements, whereas the three other 
types represent elemental groups only, carbon essentially the CNO-group, sil- 
icon the intermediate, and iron the heavy elements. Furthermore, it is not 
possible to specify from which elements of these groups the resulting energy 
spectra stem. 



6 Monte Carlo tests 

Before applying any of the unfolding algorithms to measured data it has to 
be tested if the method is suited for the actual problem. In order to get 
an estimate of the capabilities and sensitivity of an algorithm it is tested 
in an "ideal environment". According to an assumed energy spectrum for 
each primary particle type energy values are randomly chosen. Electron and 
muon numbers for each energy value are generated by Monte Carlo techniques, 
using the parametrized shower fluctuations and reconstruction properties, and 
a two-dimensional shower size spectrum, in range and binning identical with 
the measured one, is filled. The generated data correspond to approximately 
one third of the KASCADE data used. This artifical data histogram is input 
for the unfolding algorithms. 

The test procedure was carried out for different assumptions of the individ- 
ual energy spectra. Considered cases include knee features in each spectrum 
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at different energies, knee features at the same primary energy, knee features 
only in some of the elemental spectra, and only simple power laws (no knees 
at all). In all cases similar good results were achieved. In the following the test 
procedure and its results are presented for the example of a rigidity dependent 
knee. The assumed energy spectra follow a power law exhibiting a knee with 
the individual knee positions chosen to be proportional to the particle charges. 
All three unfolding methods yielded good and comparable results. As an ex- 
ample, the results of the Gold algorithm is presented here which is preferred 
because of its speed and robustness. A comparison between the results of the 
different unfolding algorithms is presented in section 7.1 and in Fig. 12 for 
the unfolding of the KASCADE data using QGSJet simulations. The "true" 
spectra of proton, helium, carbon, silicon, and iron are depicted by the open 
symbols in Fig. 11. Flux values and spectral indices below the knee are based 
on the compilation of [28]. 

To obtain reliable results, some criterion to stop the iteration is required. 
For the determination of the adequate number of iteration steps the weighted 
mean squared error (WMSE) and the relative variance of the bias (RBS) are 
used. These quantities are defined by 

1 m rr 2 4- h 2 1 m h 2 

WMSE = -V X K + 1 and RBS = -V4-, (U) 

where m is the dimension of the solution, A; the value of the ith element of the 
solution vector X and a 2 Xi its variance; 6, is the systematic bias from the true 
value and cr^j the statistical uncertainty of this bias. Details about their use in 
unfolding analyses can be found in [29]. For the determination of the WMSE 
and RBS a bootstrap method is used. The solution of the current iteration 
step serves as model for the generation of a set of Monte Carlo data which 
are deconvoluted. The obtained solution set is compared to the input spectra, 
i.e. the original solution. This method proved to work well for the estimation 
of the WMSE and RBS but provides a good estimate of the absolute value of 
the average systematic uncertainties only, and not for their sign. 

The result of the unfolding is shown in Fig. 11 for the Gold method. The open 
symbols correspond to the original "true" spectra, filled symbols represent 
the solution of the unfolding procedure. The left part of the figure displays 
the spectra of protons, helium, and carbon, the spectra of silicon and iron are 
shown in the right part. Error bars represent statistical errors whereas system- 
atic uncertainties are shown as shaded bands. Statistical errors are due to the 
limited number of simulated data and are estimated by repeating the unfold- 
ing with different sets. Systematic uncertainties are estimated by comparison 
of the mean value of the set of results with the values of the original "true" 
spectra. These systematic uncertainties are mainly due to the uncertainty in 
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Fig. 11. Unfolding results (filled symbols) for the energy spectra of H, He and C 
(left panel) and Si and Fe (right panel) together with the original "true" spectra 
(open symbols). The shaded bands are an estimate of the systematic uncertainties 
due to the applied unfolding method, in this case the Gold algorithm. 

terminating the iteration and the value of the regularization parameter, re- 
spectively. For all three unfolding methods this uncertainty is of order 15% 
for low energies, i.e. high fluxes. The strong increase of the systematic uncer- 
tainties at higher energies is due to the low fluxes and hence small number of 
showers. Since the considered algorithms are designed to generate only non- 
negative solutions they tend to introduce an additional bias in the case of 
small number of events. This bias gets significantly large for energies with less 
than « 30 events per bin. 

As can be seen in Fig. 11, the spectral features of the original spectra, like 
knee position and spectral index, are well reproduced within the statistical 
and systematic uncertainties. The artifical "wobbling" at low energies can be 
identified as a systematic relic of the unfolding procedure. Assuming a smooth 
behaviour of the original spectrum this yields an independent estimate for the 
size of the systematic uncertainties, again of order 15%. For the determination 
of the spectral shapes and the spectral indices these systematic effects have 
to be considered. Altogether, it can be concluded that the proposed analysis 
technique is applicable to the problem of unfolding the two-dimensional air 
shower size spectrum with five primary mass groups. 
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7 Results 



7.1 Results based on QGSJet 01 

All three unfolding procedures mentioned above were used in order to cross- 
check the solution. The result for the energy spectra of the light element 
groups (H, He, C) are shown in Fig. 12 for the three methods. The different 
unfolding results agree very well with each other. The same holds also for the 
heavy groups and also for the results based on SIBYLL simulations, presented 
in section 7.2. Therefore, in the following only the results using the Gold 
algorithm will be discussed which showed the highest speed and robustness 
among the applied methods. 

In addition to the statistical uncertainties due to the limited number of mea- 
sured showers and the systematic uncertainties due to the unfolding algorithm, 
two additional sources of uncertainties have to be considered. 

First, the number of simulated showers is limited, giving rise to further sta- 
tistical uncertainties of the fit parameters. To estimate this influence, each 
parameter of Eq. (7) is varied randomly within its error distribution. For each 
new set of parameters the energy dependence is interpolated and new response 
matrices are calculated. The unfolding is repeated with each set of response 
matrices and the spread of the individual fluxes determined. This additional 
statistical error is already included in the error bars in Fig. 12. 



Fig. 12. Results using QGSJet hypothesis for the elements H, He, C and for three 
different unfolding algorithms. For reason of clarity statistcal error bars are displayed 
for the results of the Gold algorithm only. 
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Fig. 13. Different extrapolations of the lg -/V e -distribution for 0.5 PeV proton induced 
showers (QGSJet 01). 

Second, the form of the tails of the shower size distributions is not known. 
Fig. 13 shows an example of the lg iV e -distribution for showers induced by 
0.5 PeV protons. Besides the parameterization used, two different extrap- 
olations are displayed, the first one with sharp cutoffs at the edges of the 
distribution, the second one with an exponential decrease up to higher and 
lower values of lgiV e . Within the statistics of the simulations each of these 
functions describes the distribution equally well. The influence of these tails 
on the shower size spectra and the unfolding result may be quite important 
because of the steeply falling primary energy spectra. The displayed parame- 
terizations in Fig. 13 can be regarded as extreme assumptions and it has been 
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Fig. 14. Unfolded energy spectra for H, He, C (left panel) and Si, Fe (right panel) 
based on QGSJet simulations. The shaded bands are an estimate of the systematic 
uncertainties due to the used parametrizations and the applied unfolding method 
(Gold algorithm). 
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investigated that the corresponding unfolding results form an upper and lower 
bound for the spectra. This range can be considered as an estimate for the 
systematic uncertainty due to the unknown shape of the distribution tails. 
It should be mentioned that the size of this systematic uncertainty should, 
according to simulations, be considerably reduced for observations close to 
shower maximum (e.g. around 5000 m a.s.L). 

In Fig. 14 the unfolding result is displayed together with the estimate of the 
total systematic uncertainty, shown as shaded bands. For low energies, the 
dominant contribution to the systematic uncertainty is due to the tails of the 
distributions. 

Below the knee helium is the most abundant element, followed by protons and 
carbon. The energy spectra of both proton and helium show a knee-like feature 
whereas for carbon no knee structure is visible. The spectra of the heavier 
elements look rather unexpected, especially in the case of iron. For energies 
below 10 PeV practically no iron is present, above 20 PeV it dominates the 
cosmic ray spectrum together with silicon. 



7.2 Results based on SIBYLL 2.1 



The outcome of the unfolding using CORSIKA /SIBYLL / GHEISHA for calcu- 
lation of the response matrices is presented in Fig. 15 for the Gold algorithm 
and five particle types. As in the case of the QGSJet analysis the different un- 
folding methods give essentially equal results. The estimated total systematic 
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Fig. 15. Unfolded energy spectra for H, He, C (left panel) and Si, Fe (right panel) 
based on SIBYLL simulations. The shaded bands are estimates of the systematic 
uncertainties due to the used parameterizations and the applied unfolding method 
(Gold algorithm). 
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uncertainties at lower energies are slightly smaller than for the QGSJet based 
results due to a better description of the measured data in the corresponding 
data range, which will be discussed in section 8.3. Each of the spectra of the 
light groups (proton, helium and CNO) shows a knee-like feature. The posi- 
tion of the individual knees is shifted to higher energies with increasing atomic 
number. In contrast to the QGSJet results, carbon is the most abundant el- 
ement at energies around 1-2 PeV but helium is again more abundant than 
hydrogen. 

The spectrum of silicon looks rather unexpected, exhibiting a knee-like struc- 
ture at around 3 PeV and decreasing very steeply above. Contrary to silicon, 
the iron spectrum looks very flat in this representation with a slight change of 
index to 7 ~ —2.5 above 10 PeV. This behaviour of the heavy group spectra 
will be discussed in section 8.3. 



8 Discussion 



8.1 All particle energy spectrum 



By summing up the five mass group spectra the all particle spectrum is ob- 
tained. It is displayed in Fig. 16 for both solutions. The estimated statistical 
uncertainties are shown by the error bars, the shaded band represents the es- 
timated systematic uncertainty, due to the applied method (Gold algorithm) 
and the parameterization of the tails of the shower size distribution, for the 
QGSJet results only. The corresponding band for the SIBYLL solution is of 
same size and omitted here for reasons of clarity. Tabulated values of the 
spectra are given in Appendix B. 

The knee is clearly visible for both cases. The spectrum is fitted with the 
expression [30] 



where p\ corresponds to the knee position, p 2 and p% are the spectral indices 
below and above the knee, and p 4 is a parameter describing the sharpness of 
the knee. In the case of the QGSJet 01 solution for the knee position a value of 
4.0 ± 0.8 PeV and for the spectral indices -2.70 ± 0.01 and -3.10 ± 0.07 were 
obtained. For the SIBYLL solution the corresponding values are 5.7±1.6 PeV, 
—2.70 ±0.06, and —3. 14 ±0.06. In both cases, the fit is insensitive to the value 
of P4 which was therefore fixed to a value of 4. The x 2 P er degree of freedom is 
0.35 in the QGSJet case, and 0.42 for the SIBYLL solution. Within statistical 
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Fig. 16. Result for the all particle energy spectrum using QGSJet and SIBYLL 
simulations in the analysis. The shaded band represents the estimated systematic 
uncertainties for the QGSJet solution which are of the same order for the SIBYLL 
solution. For reasons of clarity only the QGSJet band is displayed. 
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Fig. 17. All particle spectrum for the QGSJet 01 based analysis in comparison with 
results from RUNJOB [31], JACEE [32], Proton-3 [33], EAS-TOP [34], Tibet [35], 
HEGRA [36], Akeno [37], CASA-MIA [38], CASA-BLANCA [39], and DICE [40]. 
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uncertainties the results for the two interaction models coincide. It should be 
stressed that although the band of systematic uncertainties might suggest the 
possibility of a spectrum without a knee, each of the spectra defining this band 
exhibits a knee at around 5 PeV. 

This result is essentially independent of the interaction models used and in 
good agreement with results from other experiments. In Fig. 17 the QGSJet 
based results are displayed together with results from some other experiments. 
Concerning the flux at the knee and the knee position a very good agreement 
is especially reached with the HEGRA and the EAS-TOP experiments. 



8.2 Description of data - QGSJet based analysis 

To judge on the properties and the quality of the solution a vector Y con is 
"constructed" by forward folding of the solution according to Eq. (3) and a 
X 2 test is performed. This results in a value of Xdof = 2.38. The individual 
contributions xl °f interval i to this value are displayed in Fig. 18 as a twodi- 
mensional distribution. It is obvious that the obtained solution is not able to 
describe the data satisfactorily. 

The bulk of the deviations between measured and constructed data is concen- 
trated in the lower part of the measurement range, i.e. at low energies, with 
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Fig. 18. Distribution of the individual xf in the data range for the QGSJet solution. 
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Fig. 19. Comparison of measured data with the QGSJet based solution. Left: Elec- 
tron size distribution for 4 < lgNj[ < 4.1. Right: Electron size distribution for 
4.85 < lgiV* r < 4.95. 

a slight concentration in the region of small electron numbers (for fixed muon 
number), i.e. showers induced by heavy primaries. For higher energies (large 
shower sizes) the description of the data is quite well within the statistical 
uncertainties. 

To clarify the nature of these deviations, it is instructive to look at the lg N e - 
distribution for given lg Njf bins. Fig. 19 displays the measured distributions 
(points) for different lgiV* r bins together with the resulting distributions of 
the forward folding (histogram). In addition, the contribution of the different 
primary types are shown by smooth curves. As can be seen, a large contribu- 
tion of showers induced by light elements is needed for small muon number 
bins to describe the right tail of the distribution (large electron numbers) . As 
a consequence, no iron showers are needed for the description of the left-hand 
tail of the distribution. Even with practically no iron present at all there are 
still more showers with lg N e < 5 calculated than measured. The situation im- 
proves for higher energies (large muon numbers). First, the description of the 
distribution is better; second, now iron is also required to describe the mea- 
sured electron sizes. By investigating such figures the reason for the negligible 
iron flux at low energies for the QGSJet result can be understood. Investi- 
gations of the lg iV* r -distributions for different lg N e bins yield corresponding 
results. 

To summarize, showers generated using QGSJet seem to predict too many 
muons or too few electrons at low energies than required by the data. 
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8.3 Description of data - SIBYLL based analysis 



A similar comparison was performed using results based on the SIBYLL model. 
An overall value for Xdof °f 2.46 was obtained, being quite similar to the 
QGSJet case. Again the solution is not capable to describe the measured data 
in the complete region of measurement. The distribution of individual xh 
displayed in Fig. 20, is very different compared to the QGSJet based solution 
(Fig. 18). The bulk of the deviations is concentrated at medium to high IgNff 
and small lgiV e , i.e. in the region of heavy primaries with higher energies. 
Other than with the QGSJet solution, only small deviations occur at low 
energies. 

In Fig. 21 the measured electron size distributions together with the con- 
structed distributions are displayed for the same lgiV* r bins as in Fig. 19. The 
description of the lg iV e -distribution for low lg Nj[ bins is much better than for 
the QGSJet based result, only small deviations are found. Since the maximum 
of the lg iVe-distribution for carbon induced showers nearly coincides with that 
of the measured distribution, a high abundances of this mass group is found 
in case of the SIBYLL based solution. The good description of the low en- 
ergy data range by the SIBYLL based simulations is also the reason for the 
smaller systematic uncertainties in Fig. 15 at lower energies when compared 
to the QGSJet results. These uncertainties are dominated by the unknown 
shape of the tails of the shower fluctuations sa and reflect the stability of 
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Fig. 21. Comparison of measured data with the SIBYLL based solution. Left: Elec- 
tron size distribution for 4 < lgNj[ < 4.1. Right: Electron size distribution for 
4.85 < lgiV* r < 4.95. 

the solution against disturbances of the response matrix like changes of the 
distribution tails. This stability is highest for well described data, resulting in 
smaller uncertainties. 

In contrast, the description in the higher lg Nff range is much worse than be- 
fore. This can be seen in the right part of Fig. 21. In particular the left-hand tail 
of the lg iV e -distribution cannot be described using the five assumed particle 
types. In order to fit the distribution as well as possible the iron contribution 
has to be raised nearly to the maximum value allowed by the maximum of the 
observed distribution. On the other hand, the right tail towards higher values 
of lg N e can only be described using the lighter elements. As a result, there is 
no space left for a significant contribution of silicon which explains the sharp 
decrease in the silicon spectrum in Fig. 15. Whereas the data description at 
lower energies works quite well, at higher energies showers generated with the 
SIBYLL model seem to be too electron rich or too muon poor compared to the 
data. The same conclusion holds when investigating the lgiV* r -distribution for 
different lgiV e bins. 



8.4 Some qualitative considerations - open problems 



Despite the fact that none of the two hadronic interaction models is able to 
describe the whole data range consistently, it is possible to get hints, why their 
predictions do not match the data and how agreement with the data could be 
improved. In Fig. 22 a part of the two-dimensional size spectrum and in ad- 
dition some lines of constant intensity (isolines) are drawn. Overlaid are lines 
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representing the energy dependence of the maximum value of the probability 
distributions Pa(}s N e , lg Nf[\ lg E) of Eq. (1), i.e. the most probable pair of the 
shower size values. These lines of the most probable values are displayed for 
the primary particles hydrogen (proton) and iron and for the two simulation 
sets. 

It is noticeable that the lines of the most probable values for the two interaction 
models are nearly parallel. For SIBYLL simulations these lines are evenly 
shifted towards larger values of lg N e and smaller values of lg Nj[ with respect 
to the corresponding lines for QGSJet simulations. 

For the SIBYLL based simulations the line of the most probable values of iron 
showers tends to lie the more in the central region of the data the higher the 
energy and moves away from the lower edge (small lgA^ e for fixed lgiV* r ) of 
the data distribution. This lower edge is expected to be dominated by showers 
induced by heavy elements. Consistent with this relatively large distance to 
the lower edge are the discrepancies in the description of the measured data 
by the SIBYLL based results for higher energies (Fig. 21). One way to solve 
this problem would be a reduction of the predicted electron number in high 
energy SIBYLL simulations which would result in a decreasing slope of the 
corresponding lines of the most probable values. Another possibility might be a 
weaker decrease of shower fluctuations in SIBYLL simulations with increasing 
energy. 
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Fig. 22. Two-dimensional shower size spectrum of lg-/V e and lgNj[ together with 
isolines and lines of the most probable values for proton and iron induced showers 
for both simulations. 
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In the case of the QGSJet based solution the description of the data at higher 
energies is better whereas discrepancies occur at lower energies (see Fig. 19). 
In the region of small shower sizes no contribution of the heavy component is 
needed to describe the data. Referring to Fig. 22, one approach would be to 
shift the lines of the most probable values in the region of smaller shower sizes 
away from the lower edge of the data distribution. This would mean that more 
electrons or fewer muons are predicted for showers at low energies. Another 
possibility could be the reduction of shower fluctuations at lower energies for 
QGSJet based simulations. 

One might argue that the low energy hadronic interaction model could influ- 
ence the simulations and contribute to the differences. However, when using 
the FLUKA [41] instead of the GHEISHA code we found almost no changes 
for the electron and muon shower size distributions. Therefore, FLUKA is not 
able to improve the situation significantly. 

Although these considerations are qualitative they may give some hints for 
further improvement of hadronic interaction models. Investigations with a 
toy model which consists of simply shifting the predictions of SIBYLL based 
simulations towards QGSJet based predictions with increasing energy, resulted 
in a consistent description of the measured data. (We are aware that this is 
not a very reasonable procedure.) 

8.5 Systematic model uncertainties - the proton spectrum 

Since neither of the interaction models used in the analysis can describe the 
measured data consistently the results for the individual element spectra are 
not fully reliable. However, the difference between the results for the spectrum 
of one particular particle can serve as an estimate of the systematic uncertainty 
due to the hadronic interaction models. 

To visualize this uncertainty, it is instructive to compare direct and indirect 
measurements of protons at lower energies with the corresponding spectra of 
our analyses (Fig. 23). Due to the low abundances of elements lighter than 
carbon but other than proton and helium the results for proton would give 
the real spectrum of the single element in the case of correct simulations. 
Since the elements carbon, silicon, and iron stand for elemental groups, which 
are loosely defined, a comparison with data from direct measurements is not 
possible for these heavier elements. 

Despite the large difference between our two results they are in good agree- 
ment with the extrapolations of those of balloon-borne experiments for the 
proton spectrum. At present, the statistical uncertainties of direct measure- 
ments above 10 14 eV are of the same order of magnitude as the systematic 
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Fig. 23. Results for the proton energy spectrum for both of our analysis together with 
results from direct (AMS[42], BESS[43], CAPRICE[44], Ryan[45], SOKOL-2 [46], 
RUNJOB[31], JACEE[32]) and indirect (HEGRA [4 7], Tibet [48]) measurements. 

uncertainty of air shower based analyses due to the hadronic interaction mod- 
els. Further improvement requires a more reliable theoretical description of 
high energy hadronic interactions. 



9 Summary and conclusion 



Using the two-dimensional shower size spectrum of electron number lg N e and 
muon number lg Njf measured with KASCADE an analysis was presented 
yielding energy spectra for five primary mass groups, representing the chemical 
composition of cosmic rays. For this analysis, air shower simulations with two 
different high energy hadronic interaction models (QGSJet 01 and SIBYLL 
2.1) were used. The reconstructed all particle spectra for both simulation sets 
coincide within the statistical and systematic uncertainties and are consistent 
with results from other experiments. The knee is observed at an energy around 
~ 5 PeV with a change of index A7 « 0.4. The situation differs quite strongly 
when considering the results of the mass group spectra. Common is the ap- 
pearence of knee-like features in the spectra of the light elements. For both 
models the position of the knees in these spectra is shifted towards higher en- 
ergy with increasing element number. A closer inspection revealed that none 
of the two interaction models is capable of describing the measured data con- 
sistently over the whole measurement range. For the QGSJet based analysis 
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deviations occur at low energies whereas for the SIBYLL based analysis the 
higher energies are problematic. 

Summarizing, it has been demonstrated that unfolding methods are capable 
to reconstruct energy spectra of individual mass groups from air shower data, 
in addition to the all particle spectrum. At present, the limiting factors of the 
analysis are the properties of the high energy interaction models used and not 
the quality or the understanding of the KASCADE data. The observed dis- 
crepancies between simulations and data have to be attributed to the models 
and may give valuable information for their further improvements. 
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A The matrix equation and unfolding methods 

A.l Formulation of the problem as matrix equation 

The bin content iVj of each cell of the two-dimensional shower size spectrum 
displayed in Fig. 2 can be written as 

N A If +°° , j i 

N l = 2ixA s T m Y: j j j^PAiikNeAgN^lgE^s^edlgEde (A.l) 

-4=1 0° -oo ^ 

with the differential flux dJA/dlgE of an element of mass number A and the 
conditional probability pa describing the probability to measure a shower of 
primary energy IgE and primary mass A with shower sizes (lg N e , lg A* r )j. 
For sufficiently large showers, which are only included in the present analysis, 
measurement time T m and sampling area A s can be treated as constants and 
no dependence on azimuth angle is present, resulting in the factor 2n. 
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The probability p A can be expressed as 

+00 +00 

Pa= J J s A e A r A d lg N l ; ue d lg N^ true (A.2) 

—00 —00 

with the primary dependent intrinsic shower fluctuations s A , the properties 
of the reconstruction (resolution and systematic shifts) r A and the combined 
efficiencies for detection and reconstruction e A . 

Simulation studies have shown that at KASCADE efficiencies e A and recon- 
struction properties r A do not depend on zenith angle 9 for 9 < 20°. In addition 
the angular resolution of the KASCADE array in the considered shower size 
range is better than 0.2°, so effects due to limited angular resolution can be 
savely neglected. Since only showers with 0° < 9 < 18° are considered, the 
integration over zenith angle can be incorporated into s A . In this sense Eq. 
A.l can be written as 

Na + r° iJTa 

N l = A s T m AnYl / ^;PA((lgN e ,\gN t ;) i \\gE)d\gE (A.3) 
A=1 J dig hi 



with effective solid angle AQ. The mentioned integration over zenith angle is 
now included in the shower fluctuations. 

Using the abbreviation 

IgEj+AlgE 

xf = A s T m AQ J j^E dlgE (A ' 4) 



the integral can be written as a sum over n energy intervals of width A lg E 
with lg Ej denoting the lower bin edges: 

N A n 

• v < X X "k! ( A - 5 ) 

A=l j=l 



Here the matrix element Rf, is defined by 



IgEj+A IgE 



I ^PAmN e ,\gN t ;) i \\gE)d\gE 

I — — - d lg E 
J d\gE B 
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For small bin width A lg E the value of the matrix elements Rfj are not sen- 
sitive to the correct shape of the differential energy spectra dJ^/dlgE. A 
decoupling between the matrix element and the fluxes dJ^jdXgE is then 
achieved. In the present analysis a bin width of A\gE = 0.1 is chosen which 
turns out to be sufficiently small. 

Introducing the m-dimensional data vector Y which contains the m cell con- 
tents Ni of the two-dimensional shower size spectrum, the relation between 
data and energy spectra can be written as 



N A 



Y = R A X A with X A = 



A=l 



/A 



,}■ Q 



v- ) 



and Y 



N 2 

V ) 



(A.7) 



with the elements of the matrix R A defined by Eq. (A. 6). For a more compact 
notation the summation over different primaries can be incorporated into the 
matrix equation by defining the response matrix R and the vector of unknowns 
X schematically through 



R = (R 1 R 2 . . .) and X 



x 2 

V ) 



(At 



where the response matrix R is a block matrix consisting of the response 
matrices R A for the individual particles. Adopting this notation yields for 
Eq. (A.7) the simple expression 

Y = KX. (A.9) 



A. 2 Unfolding methods used 



A. 2.1 The Gold algorithm 

For the application of the Gold algorithm [25] a slight modification of Eq. (A.9) 

— * — 

is necessary. A new data vector Y mo( [ and a new response matrix R are defined 
via the diagonal matrix C containing the statistical uncertainties of the data: 

R = R T CCR and Y mod = R T CCF, yielding RX = Y mod . (A.10) 
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In case of existence of a solution of Eq. (A. 10) the Gold algorithm constructs 
iteratively the diagonal matrix D with elements da = Xi/y mo( i t i which yields 
the desired vector X simply as X = DY mo( i. The iterative prescription for the 
components reads 

fc+1 _ X i ymod,i 
X i ~ n 

Rij x j 

where x\ is the estimated solution in the k th iteration step. 
A. 2. 2 Bayesian unfolding 

The unfolding procedure based on the Bayesian theorem [26] constructs, like 
the Gold algorithm, iteratively a matrix P. The elements of P contain the 

— * 

probabilities for the values Xi if the data Y is measured. Here, P is not a 

— * — » — * 

diagonal matrix. The unknown vector X is calculated by X = PY. Since P 
is not a square matrix, one can directly start with Eq. (A. 9). The iterative 
prescription for the components reads 

fc+1 _ 1 \p ^jiXj 

with x\ being the estimated solution after k iteration steps. 
A. 2. 3 Entropy based unfolding 

The entropy based method [27] is a special case of regularized unfolding. The 
basic idea is the minimization of an extented ^-functional with the incorpo- 
ration of some constraints. The modified Xmod~^ unc ^ ona ^ re& ds 

xLd^t ^'^?^ 2 +rS(X) with S(X) = ±x t \n^ (A. 13) 

j=i <j \yj) i=i r t 

with a(yj) being the statistical uncertainty of the data element yj. S(X) is the 

— * 

entropy-based functional depending on the solution vector X, r the so-called 
regularization parameter governing the influence of the regularization term S. 
The values are the elements of a reference distribution vector r which can be 
considered as the best guess of the solution X. Depending on the value of the 
regularization parameter r the solution X resembles more or less the reference 
r. In this way a balance between statistical significance of the solution and 



(A.11) 



(A.12) 
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systematic uncertainty due to r is achieved. For the minimization of Eq. (A. 13) 
the MINUIT [49] package was used. 
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B Tabulated values of the all particle spectrum 



Table B.l 

Differential flux values of the all particle energy spectrum for QGSJet 01 and 
SIBYLL 2.1 based analysis. The first column of errors denotes the statistical uncer- 
tainty, the second column the systematic uncertainty. 



energy d.J/dE ± stat. ± syst. (QGSJet) dJ/dE ± stat. ± syst. (SIBYLL) 



[GeVj 


[m 2 s 1 sr 1 GeV 


ii 
\ 




[m s sr Ltev 


ii 
\ 






(b.04 zh U.ZD ± Z.zUJ 


1 n- 
• 1U 


-13 


^ qq i n 01 ^i qi a 
(b.oo zh U.zl zh l.ol J 


1 n- 
• 1U 


-13 


o OA 1 n6 
z.z4-lU 


(o.04 zh U.lo ± U. (0) 


1 n- 
• 1U 


-13 


ak _j_ n i/i ^ n 7^ 
(o.4o zh U.14 zh U./UJ 


1 n- 
• 1U 


-13 


o on i n6 
z.oz-lU 


(~\ en -t- n no. 4- n A(\\ 
yi-.oV) zh U.Uo ± U.4»J 


1 n~ 
• 1U 


-13 


M sn 4- n no -i- n qs^i 
(l.oU zh U.U» zh U.ooJ 


1 n~ 
• 1U 


-13 


O.OO-IU 


(1.U1 ± U.UO ± U.zzJ 


1 n- 
• 1U 


-13 


(l.UU zh U.Uo zh U.zzJ 


1 n- 
• 1U 


-13 


4.4 / -1U 


f a on -L n 07 _i_ 1 nn\ 
^4.yU zh U.z7 ± l.UUJ 


1 n~ 
• 1U 


-14 


f a oi -L n 07 _i_ i noA 
^4.yi zh U.Zi zh l.UZJ 


1 n~ 
• 1U 


-14 


0.DZ-1U 


(z.ov zh U.lo ± U.DbJ 


1 n- 
• 1U 


-14 


(o ftO-Ln a A -X- n c;r;\ 
^z.OZ zh U.14 zh U.DDJ 


1 n- 
• 1U 


-14 


v nc i n6 


( 1 on x n 11 _i_ n i£A 
(l.zU zh U.ll zh U.zbJ 


1 n- 
• 1U 


-14 


(l.ob zh U.1U zh U.zoJ 


1 n- 
• 1U 


-14 


o.yi-iu 


(D.41 zh U.bz zh l.ooj 


1 n- 
• 1U 


-15 


l ' C\ OR _1_ n A(\ _1_ 1 

(O.zb zh U.4b zh l.oUJ 


1 n- 
• 1U 


-15 


1 19 1f|7 
1. 1Z' 1U 


I^Z.ol zh U.oO zh u.oyj 


1 n~ 

• 1U 


-15 


{q eo 4_ n 9a 4- n 7^ 
^o.Do zh U.Zo zh U.lo) 


1 n~ 

• 1U 


-15 


1.4M0 7 


(1.54 zb 0.22 zb 0.33) 


• 10" 


-15 


(1.48 zb 0.14 zb 0.31) 


• 10" 


-15 


1.78-10 7 


(6.24 zb 1.35 zb 1.39) 


• 10" 


-16 


(7.57 =b 0.78 =b 0.16) 


• 10" 


-16 


2.24-10 7 


(3.09 =b 0.78 =b 0.64) 


• 10" 


-16 


(4.05 =b 0.51 =b 0.87) 


• 10" 


-16 


2.82-10 7 


(1.98 zb 0.45 zb 0.43) 


• 10" 


-16 


(1.87 zb 0.23 zb 0.44) 


• 10" 


-16 


3.55-10 7 


(8.10 zb 2.52 zb 1.93) 


• 10" 


-17 


(8.81 zb 0.14 zb 2.38) 


• 10" 


-17 


4.47-10 7 


(4.22 zb 1.16 zb 1.14) 


• 10" 


-17 


(3.65 zb 0.66 zb 1.18) 


• 10" 


-17 


5.62-10 7 


(1.83 =b 0.74 zb 0.79) 


• 10" 


-17 


(2.29 zb 0.45 zb 0.89) 


• 10" 


-17 


7.08-10 7 


(1.37 =b 0.40 =b 0.53) 


• 10" 


-17 


(9.29 zb 2.72 zb 5.38) 


• 10" 


-18 


8.91-10 7 


(6.07 zb 2.87 zb 4.02) 


• 10" 


-18 


(5.81 ±2.07 ±4.31) 


• io- 


-18 
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